A Novel Uncertain Fuzzy C-Means Clustering Technique Using Genetic Algorithm (UFCM-GA)
Abstract
In computer science, uncertain data is data that carries a specific uncertainty. Uncertain data is typically found in the area of sensor networks. When representing such data in a database, some indication of the probability of the various values must also be recorded. There is a growing awareness of the need for database systems to handle and correctly process data with uncertainty. The uncertainty is normally modeled as probability density functions. Beyond storing and processing such data in a DBMS, it is necessary to perform other data analysis tasks such as data mining. Fuzzy clustering is a class of algorithms for cluster analysis in which the allocation of data points to clusters is not "hard" (all-or-nothing) but "fuzzy" in the same sense as fuzzy logic. A genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution. This heuristic is routinely used to generate useful solutions to optimization and search problems. Genetic algorithms belong to the larger class of evolutionary algorithms (EA), which generate solutions to optimization problems using techniques inspired by natural evolution. In this paper we propose Uncertain Fuzzy C-Means Clustering using Genetic Algorithm (UFCM-GA). The proposed mechanism is applicable to any uncertainty region. Analysis of the experimental results shows its effectiveness compared with existing work.
Keywords: Uncertain Data Mining, Data Uncertainty, UFCM, Genetic Algorithm.
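The abstract does not spell out how UFCM-GA works, so the following is only a minimal sketch of the two standard ingredients it names: textbook fuzzy c-means memberships, and a simple elitist genetic algorithm that searches for cluster centers by minimizing the fuzzy c-means objective. It is not the authors' method and does not model uncertainty regions. The function names, the fuzzifier `m=2.0`, and the GA settings (truncation selection, uniform crossover, Gaussian mutation) are all assumptions made for illustration.

```python
import numpy as np

def sq_dists(points, centers):
    # Squared Euclidean distance from every point to every center.
    return ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

def fcm_memberships(points, centers, m=2.0):
    # Standard FCM rule: u_ij is proportional to d_ij^(-2/(m-1)),
    # normalized so each point's memberships sum to 1.
    d2 = np.maximum(sq_dists(points, centers), 1e-12)  # guard against zero distance
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm_objective(points, centers, m=2.0):
    # Sum over points and clusters of u_ij^m * d_ij^2 (lower is better).
    u = fcm_memberships(points, centers, m)
    return float(((u ** m) * sq_dists(points, centers)).sum())

def ga_search_centers(points, n_clusters, pop_size=30, generations=60,
                      mutation_scale=0.1, seed=0):
    # Hypothetical GA: each individual is one candidate set of centers.
    rng = np.random.default_rng(seed)
    lo, hi = points.min(axis=0), points.max(axis=0)
    pop = rng.uniform(lo, hi, size=(pop_size, n_clusters, points.shape[1]))
    history = []  # best objective value seen in each generation
    for _ in range(generations):
        cost = np.array([fcm_objective(points, ind) for ind in pop])
        history.append(float(cost.min()))
        # Truncation selection: the better half survives unchanged (elitism).
        elite = pop[np.argsort(cost)[: pop_size // 2]]
        n_children = pop_size - len(elite)
        pa = elite[rng.integers(len(elite), size=n_children)]
        pb = elite[rng.integers(len(elite), size=n_children)]
        # Uniform crossover: each coordinate comes from one of the two parents.
        children = np.where(rng.random(pa.shape) < 0.5, pa, pb)
        # Gaussian mutation, scaled to the range of the data.
        children = children + rng.normal(0.0, mutation_scale, children.shape) * (hi - lo)
        pop = np.concatenate([elite, children])
    cost = np.array([fcm_objective(points, ind) for ind in pop])
    history.append(float(cost.min()))
    return pop[np.argmin(cost)], history
```

Because the better half of each generation is carried over unchanged, the best objective value in `history` never gets worse from one generation to the next. This kind of population-based search is one common way to reduce the sensitivity of fuzzy c-means to its initial centers.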
Similar resources
Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms
In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...
Land Cover Classification using GA based Fuzzy Clustering Techniques for Remotely Sensed Data
Remote sensing imagery is used by government and private agencies for a wide range of applications, from military to farm development. Fuzzy c-means clustering is an effective algorithm, but the random selection of center points makes the iterative process fall easily into a local optimal solution. In this paper, a novel clustering method is developed using GA-based clustering techniques....
A Fuzzy C-means Algorithm for Clustering Fuzzy Data and Its Application in Clustering Incomplete Data
The fuzzy c-means clustering algorithm is a useful tool for clustering; but it is convenient only for crisp complete data. In this article, an enhancement of the algorithm is proposed which is suitable for clustering trapezoidal fuzzy data. A linear ranking function is used to define a distance for trapezoidal fuzzy data. Then, as an application, a method based on the proposed algorithm is pres...
ADAPTIVE NEURO FUZZY INFERENCE SYSTEM BASED ON FUZZY C–MEANS CLUSTERING ALGORITHM, A TECHNIQUE FOR ESTIMATION OF TBM PENETRATION RATE
Estimating the penetration rate of a tunnel boring machine (TBM) is a crucial and complex task encountered frequently in mechanized tunnel excavation. Estimating the machine penetration rate may reduce the risks related to the high capital costs typical of excavation operations. Thus establishing a relationship between rock properties and TBM pe...
Automatic segmentation of dermoscopy images using self-generating neural networks seeded by genetic algorithm
A novel dermoscopy image segmentation algorithm is proposed using a combination of a self-generating neural network (SGNN) and the genetic algorithm (GA). Optimal samples are selected as seeds using GA; taking these seeds as initial neuron trees, a self-generating neural forest (SGNF) is generated by training the rest of the samples using SGNN. Next the number of clusters is determined by optim...